Evaluation of Protein Structural Models Using Random Forests

Authors

  • Renzhi Cao
  • Taeho Jo
  • Jianlin Cheng
Abstract

Protein structure prediction has been a “grand challenge” problem in structural biology over the last few decades. Protein quality assessment plays a very important role in protein structure prediction. In this paper, we propose a new protein quality assessment method that predicts both the local and global quality of protein 3D structural models. Our method uses both multi-model and single-model quality assessment methods for global quality assessment, and uses chemical, physical, and geometrical features together with the global quality score for local quality assessment. CASP9 targets are used to generate the features for local quality assessment. We evaluate our local quality assessment method on CASP10, where its performance is comparable with two state-of-the-art QA methods, as measured by the average absolute difference between the real and predicted distances. In addition, we blindly tested our method on CASP11. The good performance there suggests that combining single-model and multi-model quality assessment methods could be a good way to improve the accuracy of model quality assessment, and that the random forest technique could be used to train a good local quality assessment model.
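The evaluation criterion named in the abstract, the average absolute difference between real and predicted per-residue distances, can be sketched as follows. This is only a minimal illustration, not the authors' code: the function name, the sample values, and the assumption that distances are given per residue in angstroms are all hypothetical.

```python
from typing import Sequence


def average_absolute_difference(real: Sequence[float],
                                predicted: Sequence[float]) -> float:
    """Return the mean of |real_i - predicted_i| over all residues."""
    if len(real) != len(predicted):
        raise ValueError("real and predicted must have the same length")
    if len(real) == 0:
        raise ValueError("at least one residue is required")
    return sum(abs(r - p) for r, p in zip(real, predicted)) / len(real)


# Hypothetical per-residue distances (assumed to be in angstroms)
# for a four-residue model; these values are made up for illustration.
real_distances = [1.0, 2.5, 4.0, 0.5]
predicted_distances = [1.5, 2.0, 3.0, 1.0]

score = average_absolute_difference(real_distances, predicted_distances)
print(score)
```

A lower value means the predicted local quality is closer to the real per-residue deviation.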

Similar resources

Comparison of Tourism Placement and Development Models from Land Use Planning perspective in Zagros Forests Case Study: Javanrud County

While travel and tourism have increased in recent years for numerous reasons, the problems caused by this activity have also drawn the attention of managers. Using presence points of tourists in Javanrud County together with Analytic Hierarchy Process (AHP) and Random Forest (RF) models, the conditions for the placement of tourists were investigated from the perspective of land use planning. In...

Comparison of Random Survival Forests for Competing Risks and Regression Models in Determining Mortality Risk Factors in Breast Cancer Patients in Mahdieh Center, Hamedan, Iran

Introduction: Breast cancer is one of the most common cancers among women worldwide. Patients with cancer may die due to disease progression or other types of events. These different event types are called competing risks. This study aimed to determine the factors affecting the survival of patients with breast cancer using three different approaches: cause-specific hazards regression, subdistri...

Evaluation of Nonlinear Height-Diameter Models of Two Important Species of Turkish Pine (Pinus brutia) and Mediterranean Cypress (Cupressus sempervirens var. horizontalis), in the Planted Forests

Knowledge about the relationship between tree height (H) and diameter at breast height (D) is crucial for forest planning, monitoring, biomass estimation, and forest stands dynamics description. In this study, 20 different height-diameter models were evaluated to estimate accurately the height of the trees of Pinus brutia and Cupressus sempervirens var. horizontalis species in Arabdagh region (...

Performance evaluation of forest management plans (Case study: Iranian Caspian forests)

The aim of this research was to measure the relative efficiency of forest management plans in the north of Iran. To this end, data on 12 forest management plans were collected from the financial balance sheets of Shafaroud Forest Company over a ten-year period. First, basic Data Envelopment Analysis (DEA) models (BCC and CCR) were used to determine the efficiency. The...

A Study on the Accuracy and Precision of Estimation of the Number, Basal Area and Standing Trees Volume per Hectare Using of some Sampling Methods in Forests of NavAsalem

The present study aimed to investigate the accuracy and precision of estimating the number, basal area, and volume of standing trees using random and systematic random sampling methods in the forests of West Guilan. The cost or inventory time was determined using the criterion (E%² × T). Inventory was carried out by complete sampling (census) in an area of 52 hectares. The study area (sect...

Journal:
  • CoRR

Volume: abs/1602.04277

Publication date: 2016